AWS VPC Tutorial
What is Amazon Virtual Private Cloud?
Amazon Virtual Private Cloud (VPC) is the networking layer of EC2. Using VPC, customers can create isolated private networks within the AWS public cloud.
What is a virtual private cloud?
A virtual private cloud (VPC) is an isolated group of computing resources within a public cloud. This group of resources is kept private through the allocation of private IP subnets made accessible through a secured connection like VPN.
Using a VPC, customers can create private networks where access to resources is more secure. By defining subnets, route tables, and gateways, customers can dictate network traffic flow and control which services are public (accessible through the internet) and private (accessible only within the VPC).
These types of restrictions make sense when you have a combination of public-facing services (web servers) and private-facing services (databases, application servers). While your web services obviously need public internet access, your backend systems can benefit from more limited private access.
What is a public cloud?
A public cloud is a shared collection of computing infrastructure. Different customers share the same resources without sharing data. This is known as "multitenancy". AWS is an example of a public cloud.
What is a private cloud?
A private cloud is only available to a select group of users. Access to the computing resources on a private cloud is restricted to the organization hosting the cloud. "On-premises" data centers and corporate networks are examples of private clouds. These are known as "single tenant" systems.
When you use a private cloud, you know the resources you're accessing can't be accessed by anyone outside the organization.
What is a VPC?
A VPC is a virtualized private cloud. It's a private cloud that is "virtually" hosted by a third-party vendor on behalf of the customer. Using a VPC, customers get the security benefits of a private cloud with the scalability and convenience of a public cloud.
AWS VPC is a good example. It allows customers to define private/public subnets and create isolated virtual local area networks (VLANs) using AWS public cloud resources.
Virtual private cloud vs public cloud
A VPC is a private cloud running on a public cloud. Using an individualized IP subnet, a VPC creates a private, isolated network (VLAN) within publicly hosted infrastructure. Services can privately communicate within this VLAN by means of a secured VPN connection. This makes it possible to host a "private cloud" within the "public cloud".
Virtual private cloud vs private cloud
A VPC is a private cloud...it's just hosted within a public cloud. When you use a VPC, you're using a private cloud.
How does a virtual private cloud work?
Understanding how a VPC works requires some knowledge of networking. The following concepts are critical to explaining how a VPC works:
IP Addresses
An IP address uniquely identifies a device on a network. Following is an example of an IPv4 address:
156.67.154.16
This dot notation is a fancy representation of a 32 bit number...
156.67.154.16
10011100.01000011.10011010.00010000
The dots are used to separate this number into 4 sections (octets) for better readability. At the end of the day, it's just a number...
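To make the "it's just a number" point concrete, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `to_dotted_binary` is made up for this illustration.

```python
import ipaddress

def to_dotted_binary(address: str) -> str:
    """Return the 32-bit binary form of an IPv4 address, split into octets."""
    value = int(ipaddress.IPv4Address(address))  # the address as a plain integer
    bits = format(value, "032b")                 # zero-padded to 32 binary digits
    return ".".join(bits[i:i + 8] for i in range(0, 32, 8))

# The example address from above, first as an integer, then as binary octets.
print(int(ipaddress.IPv4Address("156.67.154.16")))
print(to_dotted_binary("156.67.154.16"))
```

The second line printed should match the binary form shown above, octet for octet.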
IPv4 vs IPv6
IPv4 addresses are 32-bit numbers. They've been used for decades to identify clients on the internet.
While a 32 bit number provides over 4 billion unique addresses, it's only a matter of time before all of these unique identifiers are used. As a result, IPv6 addresses have been introduced to solve the problem. These addresses are 128 bit numbers and look like this:
2001:0000:3238:DFE1:0063:0000:0000:FEFB
An IPv6 address is divided into eight 16-bit blocks separated by a colon :. Each block is represented by a 4-digit hexadecimal number.
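As a quick check of that structure, the following sketch parses the example address with Python's standard `ipaddress` module and splits it into its blocks. The variable names are only for illustration.

```python
import ipaddress

addr = ipaddress.IPv6Address("2001:0000:3238:DFE1:0063:0000:0000:FEFB")

# The "exploded" form always shows all eight 4-digit hexadecimal blocks.
blocks = addr.exploded.split(":")
print(len(blocks), blocks)

# The "compressed" form drops leading zeros and collapses one run of zero blocks.
print(addr.compressed)
```

Both forms describe the same 128-bit number; the compressed form is just shorthand you will often see in consoles and logs.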
Networks can adopt both IPv4 and IPv6. While IPv6 certainly prepares your network for the future, an IPv6-only network can't talk to IPv4-only hosts without a translation mechanism such as a NAT64/DNS64 gateway...
An IP address identifies a host on a network. This means an IP address has both a network address and a host address.
So how do you know which part is which?
Subnet mask
A subnet mask is a 32 bit number used to identify network bits from host bits:
255.255.255.0
Just like IP addresses, we use dot notation and base 10 for easier readability. If you translate this to binary you get:
11111111.11111111.11111111.00000000
Every 1 represents a network bit. Every 0 represents a host bit. From this subnet mask, you can determine that the first 24 bits are for identifying the network and 8 bits are for identifying the host.
So if we have an IP address like this...
156.67.154.16
with a subnet mask like this...
255.255.255.0
then we can identify 156.67.154.0 as the network address and 0.0.0.16 as the host address.
Furthermore, since the last 8 bits are used for host identification, this network has 256 host addresses (254 usable, since the all-zeros host address identifies the network itself and the all-ones address is reserved for broadcast).
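The split between network and host bits is a bitwise AND. Here is a small sketch of that calculation in Python; `split_address` is a hypothetical helper written for this example.

```python
import ipaddress

def split_address(address: str, mask: str) -> tuple[str, str]:
    """Split an IPv4 address into (network part, host part) using a subnet mask."""
    ip = int(ipaddress.IPv4Address(address))
    netmask = int(ipaddress.IPv4Address(mask))
    network = ip & netmask                 # keep only the network bits (the 1s)
    host = ip & ~netmask & 0xFFFFFFFF      # keep only the host bits (the 0s)
    return str(ipaddress.IPv4Address(network)), str(ipaddress.IPv4Address(host))

network, host = split_address("156.67.154.16", "255.255.255.0")
print(network, host)
```

With the mask 255.255.255.0 this yields the network address 156.67.154.0 and the host part 0.0.0.16, as described above.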
CIDR
IP addresses were originally grouped into classes. These classes had a predetermined set of bits for network/host. For example, class C addresses have 254 available hosts, class B addresses have 65,534 hosts, etc.
The advantage of this is you didn't need a subnet mask to determine network/host bits. This was inferred based on the class.
The disadvantage of this was it was too generic for most organizations and wasted unused addresses. For example, a company might need more than 254 hosts but nowhere close to 65,000 hosts.
CIDR is the newer method of IP addressing. CIDR allows for variable-length subnet masking and looks like this:
192.0.0.0/24
192.0.0.0 represents the network address. /24 represents the number of bits used for the network address.
Since 24 bits are used for the network address, we know that only 8 bits are used for the host identification.
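Python's standard `ipaddress` module understands CIDR notation directly, so it is a handy way to sanity-check these numbers. The /23 network at the end is an extra illustration of a size that sits between the old class C and class B.

```python
import ipaddress

network = ipaddress.ip_network("192.0.0.0/24")

print(network.network_address)   # the network address
print(network.prefixlen)         # number of network bits
print(network.netmask)           # the equivalent subnet mask
print(32 - network.prefixlen)    # number of host bits
print(network.num_addresses)     # total addresses in the block

# Variable-length masks allow in-between sizes that classes could not express.
bigger = ipaddress.ip_network("192.0.0.0/23")
print(bigger.num_addresses)
```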
Subnet
A subnet is a network inside a network. "Subnetting" is the process of dividing a network into subnets.
Subnets are a way to organize network traffic so that data travels more quickly within the network. Using subnets, hosts can be grouped more closely together. This reduces unnecessary network traffic and congestion, especially when certain hosts communicate a lot with one another.
Subnets also make more efficient use of an IP address. Rather than being stuck with too many or too few predetermined hosts (like with the old class system), subnets can be defined with a variable number of network and host addresses.
How does subnetting work?
To create subnets from a network, host bits are "borrowed" to create new networks. Take the following IP address:
156.67.154.16
and its subnet mask...
255.255.255.0
and its subnet mask converted to binary...
11111111.11111111.11111111.00000000
Remember the subnet mask indicates the first 24 bits are for identifying the network and the last 8 bits are for the host.
Now let's borrow two of the host bits for subnetting...
11111111.11111111.11111111.11000000
The two borrowed bits can take four values (00, 01, 10, 11), so by using 2 of the host bits as network bits, we effectively create 4 new networks...
11111111.11111111.11111111.00000000
11111111.11111111.11111111.01000000
11111111.11111111.11111111.10000000
11111111.11111111.11111111.11000000
Now our subnet mask looks like this:
255.255.255.192
The number of bits "borrowed" can vary. If 3 bits are borrowed, then 8 subnets are created etc.
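The borrowing described above can be reproduced with the standard `ipaddress` module. This is a sketch, assuming the example network 156.67.154.0/24; `prefixlen_diff` is the number of host bits to borrow.

```python
import ipaddress

parent = ipaddress.ip_network("156.67.154.0/24")

# Borrow 2 host bits: the /24 becomes four /26 networks.
subnets = list(parent.subnets(prefixlen_diff=2))
for subnet in subnets:
    print(subnet, subnet.netmask, subnet.num_addresses)

# Borrow 3 host bits instead: eight /27 networks.
print(len(list(parent.subnets(prefixlen_diff=3))))
```

Each of the four subnets reports the mask 255.255.255.192 and holds 64 addresses, a quarter of the original 256.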
What is a private subnet?
A private subnet contains resources with private IP addresses. In AWS terms, it is a subnet whose route table has no direct route to an internet gateway.
What is a public subnet?
A public subnet contains resources with public IP addresses. In AWS terms, it is a subnet whose route table has a route to an internet gateway.
Route table
Networks communicate by sending packets of data to one another. These packets contain the destination IP address aka where the packet will be sent.
Routers exist in networks to "route" traffic to the right place. Route tables are how routers decide where to send these packets. Route tables exist in RAM and look something like this:
Subnet Mask | Network Address | Next Hop Address |
---|---|---|
255.255.255.240 | 192.17.7.208 | 192.12.7.15 |
255.255.255.240 | 192.17.7.144 | 192.12.7.67 |
255.255.255.0 | 192.17.7.0 | 192.12.7.251 |
How does a route table work?
The route table takes a destination IP address and performs a bitwise AND operation with the subnet mask. If the result equals the network address, then the traffic gets forwarded to the "next hop" address.
Route tables are where these subnet masks become important. The routing table uses the subnet mask to identify which network address matches the destination IP.
Remember that the subnet mask identifies network bits (1s) from host bits (0s). Performing a bitwise AND on the destination IP and internal subnet masks essentially "maps" the IP address to an internal subnet defined within the network.
If no matching network address is found, the traffic is forwarded to a default gateway. This gateway is typically an internet gateway.
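The lookup described above can be sketched in a few lines of Python. The routes are copied from the example table; `DEFAULT_GATEWAY` and the function names are hypothetical, since the table does not list a default route. The rows are checked in order, which works here because the more specific masks come first. Real routers generally prefer the most specific (longest) matching prefix regardless of order.

```python
import ipaddress

# (subnet mask, network address, next hop), copied from the example table.
ROUTES = [
    ("255.255.255.240", "192.17.7.208", "192.12.7.15"),
    ("255.255.255.240", "192.17.7.144", "192.12.7.67"),
    ("255.255.255.0", "192.17.7.0", "192.12.7.251"),
]
DEFAULT_GATEWAY = "192.12.7.1"  # hypothetical value, not part of the table

def to_int(address: str) -> int:
    """Convert a dotted IPv4 address to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(address))

def next_hop(destination: str) -> str:
    """Return the next hop for a destination IP, or the default gateway."""
    dest = to_int(destination)
    for mask, network, hop in ROUTES:
        # Bitwise AND strips the host bits; compare what is left to the network.
        if (dest & to_int(mask)) == to_int(network):
            return hop
    return DEFAULT_GATEWAY

for ip in ["192.17.7.210", "192.17.7.150", "192.17.7.20", "8.8.8.8"]:
    print(ip, "->", next_hop(ip))
```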
Each subnet is associated with a route table. Multiple subnets can be associated with the same route table. Subnets can't be associated with multiple route tables.
By default, subnets are associated with the main route table.
Gateway
An internet gateway is what allows the VPC to reach the internet. A gateway is a router that connects two (dissimilar) networks.
Gateway vs Router
A router connects similar networks. Two networks using TCP/IP protocol could be connected via a router.
A gateway connects dissimilar networks. You can think of a gateway as a protocol converter. AWS VPC uses an internet gateway to connect an AWS private network to the world wide web.
Your internet service provider (ISP) is the gateway between your local home network and the internet.
A gateway can perform network address translation (NAT) and serves as the bridge between a local network and the outside world.
Virtual private network (VPN)
A VPN creates secured, encrypted communication between a local network and another location.
Think of a VPN as a more secure version of a proxy server. A proxy server masks your IP address. Any outbound traffic from your local network is forwarded to the proxy. The proxy then forwards the traffic using its own IP address and not yours.
A VPN adds the extra encryption layer to the proxy concept. Not only is the IP masked, the data packet itself is also encrypted making it virtually impossible to identify anything about the communication.
VPN vs HTTPS proxy
SSL is a protocol for encrypting data sent over the internet. While some VPN implementations use SSL for encryption, a VPN is still considered more secure than basic HTTPS over the internet.
You could argue that a proxy with SSL achieves the same level of security as a VPN. After all, the data is encrypted via SSL and the IP is masked by the proxy, right?
It turns out VPNs are still considered more secure. VPNs are implemented at the OS level and guarantee a heavily encrypted tunnel of communication across all applications.
For these reasons, a VPN is considered more secure than a regular proxy using HTTPS.
VPC Endpoint
A VPC endpoint allows you to privately connect to other services in AWS. These endpoints include gateway endpoints, interface endpoints, and Gateway Load Balancer endpoints.
AWS VPC endpoints allow you to connect to S3 (for instance) from your network without the use of the internet (internet gateway, NAT gateway).
Bringing it all together...
Today AWS automatically creates a default VPC for your account. This VPC includes a default IPv4 range of 172.31.0.0/16. This means 16 bits are reserved for the network, leaving 16 bits for your hosts. 16 host bits means you can support a little over 65,000 devices on your VPC!
The default VPC also creates a /20 subnet in each Availability Zone. This means 4 more bits are "borrowed" for creating subnets. Each subnet can have a little over 4,000 host addresses.
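The arithmetic above is easy to verify with Python's `ipaddress` module. This sketch only does the address math on the default range; it does not talk to AWS.

```python
import ipaddress

vpc = ipaddress.ip_network("172.31.0.0/16")

# Carve the /16 into /20 blocks, the size of each default subnet.
subnets = list(vpc.subnets(new_prefix=20))

print(vpc.num_addresses)          # addresses in the whole VPC range
print(len(subnets))               # how many /20 blocks fit in a /16
print(subnets[0], subnets[1])     # the first two /20 blocks
print(subnets[0].num_addresses)   # addresses per /20 block
```

Borrowing 4 bits gives room for 16 such blocks of 4,096 addresses each. AWS reserves a handful of addresses in every subnet for its own use, so the usable count per subnet is slightly lower.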
Each subnet is public by default with the default VPC...
The default VPC includes a main route table. AWS route tables are simplified versions of real route tables. The default route table looks like this...
The 172.31.0.0/16 destination (the "local" route) can't be deleted and allows all the resources within the VPC to talk to each other without any additional configuration.
The default VPC provides an internet gateway. Notice how this gateway is listed in the default route table as a target for the "catch all" address 0.0.0.0/0.
AWS VPC Architecture
This is the official AWS architecture diagram for a VPC with a public and private subnet. Notice how the instances from each subnet can communicate internally with each other via the router.
For external communication with the internet, public subnet instances use the internet gateway. Private subnet instances can use the NAT gateway (hosted in the public subnet) to also communicate with the internet.
AWS VPC Pricing
AWS VPC is free by itself. Of course the same rates apply to the EC2 instances you're running within the VPC.
Remember that AWS automatically creates a VPC for you by default. When you deploy instances into your VPC, you won't be charged anything outside of normal rates for the underlying instances.
AWS VPC starts to cost money when you utilize Site-to-Site VPN connections, PrivateLink (VPC endpoints), NAT gateways, and traffic mirroring.
VPN connections are billed on an hourly basis. At the time of writing, the rate for AWS Site-to-Site VPN is $0.05/hour. Data transferred over VPN connections is charged at standard AWS Data Transfer rates.
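To get a feel for what an hourly rate adds up to, here is a rough back-of-the-envelope sketch. It uses the $0.05/hour figure quoted above and assumes an average month of 730 hours; both constants are assumptions to adjust against current pricing, and data transfer is not included.

```python
HOURLY_RATE_USD = 0.05   # rate quoted in this article; check current AWS pricing
HOURS_PER_MONTH = 730    # assumption: average month (365 * 24 / 12)

def monthly_vpn_cost(connections: int = 1) -> float:
    """Estimate the monthly connection charge, excluding data transfer."""
    return round(connections * HOURLY_RATE_USD * HOURS_PER_MONTH, 2)

print(monthly_vpn_cost())    # one always-on connection
print(monthly_vpn_cost(2))   # two always-on connections
```

At that rate, a single always-on connection comes to roughly $36.50 per month before data transfer charges.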
Data transfer charges are not incurred when you access AWS services via the internet gateway. For example, if you access S3 through the internet gateway, you won't be charged for this.
AWS VPC Security
EC2 security groups allow you to specify inbound/outbound traffic limitations for any given instance.
You can also use access control lists (ACLs) to control inbound/outbound traffic at the subnet level.
Using a VPC makes your system more secure because your resources are isolated at the network level. Applying ACLs to different subnets, and security groups to instances, allows you to control access to specific devices and services more thoroughly.
AWS VPC Tutorial: Creating a VPC with public and private subnets
The AWS VPC wizard makes it very easy to launch a new VPC. Remember that your AWS account automatically generates a default VPC for you.
These steps show you how to easily create a VPC with public/private subnets using the wizard. They also explain the magic behind the wizard.
1) Launch the wizard
From the VPC dashboard, you can easily launch the wizard to create a new VPC:
2) Select configuration
You'll see a few different options for configuring your VPC. Select the second option, "VPC with Public and Private subnets".
3) Select configuration options
Here you can specify an IPv6 CIDR block to include with your VPC. You can also specify which Availability Zones (AZ) your VPC will run in, the public/private subnet range, etc.
Most of the defaults are fine. You will have to provide an "elastic IP" for the NAT gateway of your VPC. This will allow private subnet addresses to connect to the internet.
You can easily create a new elastic IP from the EC2 dashboard.
4) Create the VPC
Click the button and that's it! Your VPC will take a few minutes to create, but afterwards you can see what was created from the main AWS VPC console...
What just happened?
Using the AWS VPC wizard makes things too easy. While it's super convenient, it's also important to understand what you just created...
Public/private subnets
The VPC wizard creates a public and private subnet. Notice how these subnets fall within the IPv4 range specified:
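You can check this "falls within the range" relationship yourself with Python's `ipaddress` module. The VPC range 10.0.0.0/16 matches the route tables discussed in this tutorial; the two subnet ranges below are assumed values for illustration, so substitute the ranges shown in your own console.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")

# Assumed subnet ranges for illustration; use the values from your console.
subnets = {
    "public": ipaddress.ip_network("10.0.0.0/24"),
    "private": ipaddress.ip_network("10.0.1.0/24"),
}

# Every subnet must sit inside the VPC range...
for name, subnet in subnets.items():
    print(name, subnet, subnet.subnet_of(vpc))

# ...and subnets in the same VPC must not overlap each other.
print(subnets["public"].overlaps(subnets["private"]))
```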
The private subnet route table
The private subnet has its own route table. It looks like this:
The local entry 10.0.0.0/16 allows every instance in the VPC to communicate without additional configuration.
The nat entry 0.0.0.0/0 gives the private subnet access to the internet via the NAT gateway (utilizing the elastic IP address you had to provide during setup).
With this configuration, the private route table acts as the main route table for the VPC.
The public subnet route table
The local entry 10.0.0.0/16 allows every instance in the VPC to communicate without additional configuration.
The internet gateway (igw) entry 0.0.0.0/0 allows the public subnet instances to communicate with the internet.
With this configuration, the public route table is a "custom route table".
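The two route tables differ only in where the catch-all route points. The sketch below models them as plain dictionaries and picks the most specific matching route for a destination. The target names are placeholders (real targets are IDs that start with `nat-` or `igw-`), and `resolve` is a hypothetical helper, not an AWS API.

```python
import ipaddress

# Placeholder targets; in the console these are IDs like "nat-..." and "igw-...".
PRIVATE_ROUTES = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-gateway"}
PUBLIC_ROUTES = {"10.0.0.0/16": "local", "0.0.0.0/0": "internet-gateway"}

def resolve(route_table: dict[str, str], destination: str) -> str:
    """Pick the target of the most specific route that contains the destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        ipaddress.ip_network(cidr)
        for cidr in route_table
        if dest in ipaddress.ip_network(cidr)
    ]
    # The longest prefix is the most specific match; 0.0.0.0/0 matches everything.
    best = max(matches, key=lambda net: net.prefixlen)
    return route_table[str(best)]

print(resolve(PRIVATE_ROUTES, "10.0.1.25"))  # traffic that stays inside the VPC
print(resolve(PRIVATE_ROUTES, "8.8.8.8"))    # private subnet going out
print(resolve(PUBLIC_ROUTES, "8.8.8.8"))     # public subnet going out
```

Traffic to another instance in the VPC matches the local route in both tables, while internet-bound traffic falls through to the catch-all route and leaves through whichever gateway that table names.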
Security configurations
The wizard creates both a network access control list (ACL) and a security group for the VPC.
Network ACL
The Network ACL is applied to both the public and private subnets. The Network ACL controls inbound/outbound traffic for subnets.
Security Group
Security groups control inbound/outbound traffic for instances.
Gateways
The wizard generates both a NAT gateway and an internet gateway for the VPC.
The NAT gateway lives in the public subnet of the VPC, while the internet gateway is attached to the VPC itself.
When instances in public subnets want to communicate over the internet, they go through the internet gateway.
When instances in private subnets want to communicate over the internet, they go through the NAT gateway.
Conclusion
So you've created a VPC with both private and public subnets...SO WHAT?
By separating your network into public and private subnets, you have better control over the inbound/outbound traffic for your network.
While the private subnet limits access to the internet for backend services like application servers and databases, public subnets give web apps direct access to the internet gateway.
The VPC allows the instances in both public/private subnets to easily communicate with one another internally.
By giving you the ability to define IP ranges, subnets, route tables, etc., AWS VPC enables you to realize the benefits of a private cloud hosted on public cloud infrastructure.